AITopics | probabilistic analysis

A Unifying View of Optimism in Episodic Reinforcement Learning

Neural Information Processing Systems, Dec-23-2025, 18:27:48 GMT

In this paper we provide a general framework for designing, analyzing and implementing optimistic algorithms in the episodic reinforcement learning problem. This framework is built upon Lagrangian duality, and demonstrates that every model-optimistic algorithm that constructs an optimistic MDP has an equivalent representation as a value-optimistic dynamic programming algorithm. These two classes of algorithms were previously thought to be distinct, with model-optimistic algorithms benefiting from a cleaner probabilistic analysis while value-optimistic algorithms are easier to implement and thus more practical. With the framework developed in this paper, we show that it is possible to get the best of both worlds by providing a class of algorithms which have a computationally efficient dynamic-programming implementation and also a simple probabilistic analysis. Besides being able to capture many existing algorithms in the tabular setting, our framework can also address large-scale problems under realizable function approximation, where it enables a simple model-based analysis of some recently proposed methods.
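To make "value-optimistic dynamic programming" concrete, here is a minimal sketch of backward induction with a count-based exploration bonus in a tabular, finite-horizon MDP. It is a generic illustration and not the paper's algorithm: the function name, the argument layout, the bonus form `bonus_scale * (horizon - h) / sqrt(n)`, and the assumption that rewards lie in [0, 1] are all choices made for this example.

```python
import math


def optimistic_value_iteration(counts, reward_sums, horizon, bonus_scale=1.0):
    """Backward induction with an exploration bonus (illustrative sketch).

    counts[s][a][s2]  : number of observed transitions from (s, a) to s2
    reward_sums[s][a] : sum of observed rewards for (s, a), rewards in [0, 1]
    Returns (values, policy) with values[h][s] and policy[h][s].
    """
    n_states = len(counts)
    n_actions = len(counts[0])
    # values[horizon] stays at zero: nothing is earned after the last stage.
    values = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]
    for h in range(horizon - 1, -1, -1):
        remaining = float(horizon - h)  # largest achievable return from stage h
        for s in range(n_states):
            best_q, best_a = -math.inf, 0
            for a in range(n_actions):
                n = sum(counts[s][a])
                if n == 0:
                    # Unvisited pair: assume the best possible outcome.
                    q = remaining
                else:
                    r_hat = reward_sums[s][a] / n
                    next_v = sum(
                        c / n * values[h + 1][s2]
                        for s2, c in enumerate(counts[s][a])
                    )
                    bonus = bonus_scale * remaining / math.sqrt(n)
                    # Clip so the optimistic estimate never exceeds the maximum return.
                    q = min(r_hat + next_v + bonus, remaining)
                if q > best_q:
                    best_q, best_a = q, a
            values[h][s] = best_q
            policy[h][s] = best_a
    return values, policy
```

With no data at all, every state-action pair receives the maximal value, so the agent is driven toward pairs it has not tried. As counts grow the bonus shrinks and the estimates approach the empirical model's values.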

artificial intelligence, machine learning, reinforcement learning

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Beyond Worst-case: A Probabilistic Analysis of Affine Policies in Dynamic Optimization

Neural Information Processing Systems, Nov-21-2025, 16:21:45 GMT

Affine policies (or control) are widely used as a solution approach in dynamic optimization where computing an optimal adjustable solution is usually intractable. While the worst-case performance of affine policies can be poor, the empirical performance is observed to be near-optimal for a large class of problem instances. For instance, in the two-stage dynamic robust optimization problem with linear covering constraints and uncertain right-hand side, the worst-case approximation bound for affine policies is $O(\sqrt{m})$, which is also tight (see Bertsimas and Goyal (2012)), whereas the observed empirical performance is near-optimal. In this paper, we aim to address this stark contrast between the worst-case and the empirical performance of affine policies. In particular, we show that affine policies give a good approximation for the two-stage adjustable robust optimization problem with high probability on random instances where the constraint coefficients are generated i.i.d.

affine policy, empirical performance, probabilistic analysis

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.79)

Probabilistic Analysis of Copyright Disputes and Generative AI Safety

Chiba-Okabe, Hiroaki

arXiv.org Artificial Intelligence, Dec-2-2024

This paper presents a probabilistic approach to analyzing copyright infringement disputes by formalizing relevant judicial principles within a coherent framework based on the random-worlds method. It provides a structured analysis of key evidentiary principles, with a particular focus on the "inverse ratio rule," a controversial doctrine adopted by some courts. Although this rule has faced significant criticism, a formal proof demonstrates its validity, provided it is properly defined. Additionally, the paper examines the heightened copyright risks posed by generative AI, highlighting how extensive access to copyrighted material by generative models increases the risk of infringement. Utilizing the probabilistic approach, the Near Access-Free (NAF) condition, previously proposed as a potential mitigation strategy, is evaluated. The analysis reveals that while the NAF condition mitigates some infringement risks, its justifiability and efficacy are questionable in certain contexts. These findings demonstrate how a rigorous probabilistic approach can advance our understanding of copyright jurisprudence and its interaction with emerging technologies.

dispute and generative ai safety, probabilistic analysis

arXiv.org Artificial Intelligence

2410.00475

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.60)

A Unifying View of Optimism in Episodic Reinforcement Learning

Neural Information Processing Systems, Oct-9-2024, 13:08:09 GMT

In this paper we provide a general framework for designing, analyzing and implementing optimistic algorithms in the episodic reinforcement learning problem. This framework is built upon Lagrangian duality, and demonstrates that every model-optimistic algorithm that constructs an optimistic MDP has an equivalent representation as a value-optimistic dynamic programming algorithm. These two classes of algorithms were previously thought to be distinct, with model-optimistic algorithms benefiting from a cleaner probabilistic analysis while value-optimistic algorithms are easier to implement and thus more practical. With the framework developed in this paper, we show that it is possible to get the best of both worlds by providing a class of algorithms which have a computationally efficient dynamic-programming implementation and also a simple probabilistic analysis. Besides being able to capture many existing algorithms in the tabular setting, our framework can also address large-scale problems under realizable function approximation, where it enables a simple model-based analysis of some recently proposed methods.

algorithm, episodic reinforcement learning, unifying view

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)

A probabilistic analysis of shotgun sequencing for metagenomics

#artificialintelligence, Jan-14-2022, 01:21:46 GMT

Genome sequencing is the basis for many modern biological and medicinal studies. With recent technological advances, metagenomics has become a problem of interest. This problem entails the analysis and reconstruction of multiple DNA sequences from different sources. Shotgun genome sequencing works by breaking up long DNA sequences into shorter segments called reads. Given this collection of reads, one would like to reconstruct the original collection of DNA sequences. For experimental design in metagenomics, it is important to understand how the minimal read length necessary for reliable reconstruction depends on the number and characteristics of the genomes involved. Utilizing simple probabilistic models for each DNA sequence, we analyze the identifiability of collections of M genomes of length N in an asymptotic regime in which N tends to infinity and M may grow with N. Our first main result provides a threshold in terms of M and N so that if the read length exceeds the threshold, then a simple greedy algorithm successfully reconstructs the full collection of genomes with probability tending to one. Our second main result establishes a lower threshold in terms of M and N such that if the read length is shorter than the threshold, then reconstruction of the full collection of genomes is impossible with probability tending to one.
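The read-and-reconstruct setup described above can be illustrated with a small greedy overlap-merging sketch. This is a toy version under stated assumptions, not the paper's algorithm or analysis: reads are error-free, one read starts at every position, and `overlap`, `greedy_assemble`, and the `min_overlap` cutoff are hypothetical names introduced for this example.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of contigs with the largest overlap.

    Stops when no pair overlaps by at least `min_overlap` characters and
    returns the remaining contigs, one per reconstructed sequence in the
    ideal case.
    """
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k < min_overlap:
            return contigs
        # Glue the two contigs together, writing the shared part only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [
            c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)
        ] + [merged]


# Two toy "genomes" with no 5-character substring in common, read length 6.
genomes = ["ACGTTGCAAGGCTA", "TTTTCCCCGGGGAAAA"]
reads = [g[i:i + 6] for g in genomes for i in range(len(g) - 5)]
print(sorted(greedy_assemble(reads, min_overlap=5)))
```

The example works because every 5-character substring is unique across both sequences, so an overlap of 5 or more can only come from reads that are truly adjacent. Shortening the reads until substrings repeat breaks this guarantee, which is the kind of read-length dependence the abstract studies.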

probabilistic analysis, shotgun

#artificialintelligence

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

Beyond Worst-case: A Probabilistic Analysis of Affine Policies in Dynamic Optimization

Housni, Omar El, Goyal, Vineet

Neural Information Processing Systems, Feb-14-2020, 16:12:20 GMT

Affine policies (or control) are widely used as a solution approach in dynamic optimization where computing an optimal adjustable solution is usually intractable. While the worst-case performance of affine policies can be poor, the empirical performance is observed to be near-optimal for a large class of problem instances. For instance, in the two-stage dynamic robust optimization problem with linear covering constraints and uncertain right-hand side, the worst-case approximation bound for affine policies is $O(\sqrt{m})$, which is also tight (see Bertsimas and Goyal (2012)), whereas the observed empirical performance is near-optimal. In this paper, we aim to address this stark contrast between the worst-case and the empirical performance of affine policies. In particular, we show that affine policies give a good approximation for the two-stage adjustable robust optimization problem with high probability on random instances where the constraint coefficients are generated i.i.d.

affine policy, empirical performance, probabilistic analysis

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.87)

Making Decisions with Belief Functions

Strat, Thomas M.

arXiv.org Artificial Intelligence, Mar-27-2013

A primary motivation for reasoning under uncertainty is to derive decisions in the face of inconclusive evidence. However, Shafer's theory of belief functions, which explicitly represents the underconstrained nature of many reasoning problems, lacks a formal procedure for making decisions. Clearly, when sufficient information is not available, no theory can prescribe actions without making additional assumptions. Faced with this situation, some assumption must be made if a clearly superior choice is to emerge. In this paper we offer a probabilistic interpretation of a simple assumption that disambiguates decision problems represented with belief functions. We prove that it yields expected values identical to those obtained by a probabilistic analysis that makes the same assumption. In addition, we show how the decision analysis methodology frequently employed in probabilistic reasoning can be extended for use with belief functions.

belief function, upstream oil & gas, us government

arXiv.org Artificial Intelligence

1304.1531

Country: